A frequency bin-wise nonlinear masking algorithm in convolutive mixtures for speech segregation.

Authors

Tai-Shih Chi

Ching-Wen Huang

Wen-Sheng Chou

Abstract

A frequency bin-wise nonlinear masking algorithm is proposed in the spectrogram domain for speech segregation in convolutive mixtures. The contribution weight of each speech source to a time-frequency unit of the mixture spectrogram is estimated by a nonlinear function based on location cues. For each sound source, a non-binary mask is formed from the estimated weights and multiplied with the mixture spectrogram to extract the sound. Head-related transfer functions (HRTFs) are used to simulate convolutive sound mixtures as perceived by listeners. Simulation results show that the proposed method outperforms the convolutive independent component analysis and degenerate unmixing and estimation technique methods in almost all test conditions.
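The masking step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the nonlinear function or the exact location cues, so the code below makes its own assumptions: it uses the interaural phase difference between two ear signals as the location cue, a Gaussian-shaped weighting as the nonlinearity, and a per-source, per-bin expected phase difference (`expected_ipd`) that would in practice have to come from the HRTFs. The function names and the `sharpness` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def soft_location_masks(X_left, X_right, expected_ipd, sharpness=4.0):
    """Return non-binary masks of shape (K, F, T) from interaural phase cues.

    X_left, X_right : complex spectrograms of the two ear signals, shape (F, T).
    expected_ipd    : assumed phase difference per source and frequency bin,
                      shape (K, F); a stand-in for cues derived from HRTFs.
    sharpness       : hypothetical parameter; larger values make masks harder.
    """
    # Observed interaural phase difference in every time-frequency unit.
    observed_ipd = np.angle(X_left * np.conj(X_right))            # (F, T)

    # Phase error to each source's expected cue, wrapped to (-pi, pi].
    diff = observed_ipd[None, :, :] - expected_ipd[:, :, None]    # (K, F, T)
    error = np.angle(np.exp(1j * diff))

    # Assumed nonlinearity (not the paper's): Gaussian-shaped weight per source.
    weights = np.exp(-sharpness * error ** 2)

    # Normalize so the weights of all sources sum to one in each unit.
    return weights / weights.sum(axis=0, keepdims=True)


def extract_sources(X_mix, masks):
    """Multiply each source mask with the mixture spectrogram, giving (K, F, T)."""
    return masks * X_mix[None, :, :]
```

Because `expected_ipd` is indexed by frequency bin, each bin is weighted against its own cue, which is one way to read "frequency bin-wise" in a convolutive setting. Since the masks sum to one across sources, the extracted spectrograms add back up to the mixture.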


Similar resources

A multistage approach to blind separation of convolutive speech mixtures

We propose a novel algorithm for the separation of convolutive speech mixtures using two-microphone recordings, based on the combination of independent component analysis (ICA) and ideal binary mask (IBM), together with a post-filtering process in the cepstral domain. The proposed algorithm consists of three steps. First, a constrained convolutive ICA algorithm is applied to separate the source...


Problems in Blind Separation of Convolutive Speech Mixtures by Negentropy Maximization

This paper aims to examine the suitability of marginal-statistics-based contrast functions, e.g. negentropy, for the separation of convolutive speech mixtures picked up by a linear microphone array. For this study we choose our frequency-domain fixed-point ICA algorithm, based on negentropy maximization of the independent components. This algorithm is based on the heuristic assumption, in accordan...


Blind speech source localization, counting and separation for 2-channel convolutive mixtures in a reverberant environment

In this paper, the tasks of speech source localization, source counting and source separation are addressed for an unknown number of sources in a stereo recording scenario. In the first stage, the angles of arrival of individual source signals are estimated through a peak finding scheme applied to the angular spectrum which has been derived using non-linear GCC-PHAT. Then, based on the known ch...


Blind Source Separation of Convolutive Mixtures of Speech in Frequency Domain

This paper overviews a total solution for frequency-domain blind source separation (BSS) of convolutive mixtures of audio signals, especially speech. Frequency-domain BSS performs independent component analysis (ICA) in each frequency bin, and this is more efficient than time-domain BSS. We describe a sophisticated total solution for frequency-domain BSS, including permutation, scaling, circular...


Underdetermined Convolutive Blind Source Separation via Time-Frequency Masking

In this paper we consider the problem of separating an unknown number of sources from their underdetermined convolutive mixtures via time-frequency (TF) masking. We propose two algorithms, one for the estimation of the masks which are to be applied to the mixture in the TF domain for the separation of signals in the frequency domain, and the other for solving the permutation problem. The algori...



Journal:

The Journal of the Acoustical Society of America

Volume 131, Issue 5

Pages: -

Publication date: 2012
